compute engine
Resource Utilization Optimized Federated Learning
Zhang, Zihan, Wong, Leon, Varghese, Blesson
Zihan Zhang, University of St Andrews, UK; Leon Wong, Rakuten Mobile, Inc., Japan; Blesson Varghese, University of St Andrews, UK

Abstract -- Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems have low resource utilization, limiting their practical use in the real world. This inefficiency primarily arises from two types of idle time: (i) task dependency between the server and devices, and (ii) stragglers among heterogeneous devices. This paper introduces FedOptima, a resource-optimized FL system designed to simultaneously minimize both types of idle time; existing systems do not eliminate or reduce both at the same time. First, devices operate independently of each other using asynchronous aggregation to eliminate straggler effects, and independently of the server by utilizing auxiliary networks to minimize idle time caused by task dependency. Second, the server performs centralized training using a task scheduler that ensures balanced contributions from all devices, improving model accuracy. Four state-of-the-art offloading-based and asynchronous FL methods are chosen as baselines. Experimental results show that, compared to the best results of the baselines on convolutional neural networks and transformers on multiple lab-based testbeds, FedOptima (i) achieves higher or comparable accuracy, (ii) accelerates training by 1.9x to 21.8x, (iii) reduces server and device idle time by up to 93.9% and 81.8%, respectively, and (iv) increases throughput by 1.1x to 2.0x.

Index Terms -- federated learning, distributed system, resource utilization, idle time, edge computing

I. INTRODUCTION

Federated learning (FL) [1]-[3] offers distributed training across user devices as an alternative to traditional centralized machine learning. Devices train a deep neural network (DNN) on their data and send model parameters to the server. The server aggregates these into a global model, which is then distributed to the devices for the next round. Thus, FL utilizes insight from user data via local models to train a global model without sharing the original data.

Sub-optimal resource utilization is a critical problem in FL that results in two types of idle time on the server and devices (see Section II-A). The first is due to task dependency between the server and devices: the server is idle for considerable periods when aggregating local models from devices, as it waits for on-device training to complete, which is usually time-consuming. The second is due to hardware heterogeneity: stragglers, or slower devices, require more time to train than faster devices, which idle while waiting for the stragglers. Two categories of methods are considered in the existing literature for reducing idle time.
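For intuition only, the sketch below shows a staleness-weighted asynchronous aggregation loop of the kind the abstract alludes to: the server merges each device's update as soon as it arrives instead of waiting for a full round. The AsyncAggregator class, the staleness_weight rule, and the update schedule are illustrative assumptions, not FedOptima's actual algorithm.

```python
import copy

def staleness_weight(staleness, alpha=0.6):
    # Illustrative assumption: down-weight updates computed against older
    # versions of the global model so stale devices cannot dominate.
    return alpha / (1.0 + staleness) ** 0.5

class AsyncAggregator:
    """Server-side asynchronous aggregator (illustrative sketch only)."""

    def __init__(self, global_params):
        self.global_params = global_params  # dict: parameter name -> array
        self.version = 0                    # global model version counter

    def on_device_update(self, device_params, device_version):
        # Merge one device's local model the moment it arrives; no device
        # ever waits for a straggler, which removes that source of idle time.
        staleness = self.version - device_version
        w = staleness_weight(staleness)
        for name, value in device_params.items():
            self.global_params[name] = (1 - w) * self.global_params[name] + w * value
        self.version += 1
        # The refreshed global model and its version go straight back to the
        # device, which resumes local training immediately.
        return copy.deepcopy(self.global_params), self.version
```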
Late Breaking Results: Energy-Efficient Printed Machine Learning Classifiers with Sequential SVMs
Besias, Spyridon, Sertaridis, Ilias, Afentaki, Florentia, Balaskas, Konstantinos, Zervakis, Georgios
Printed Electronics (PE) provide a mechanically flexible and cost-effective solution for machine learning (ML) circuits compared to silicon-based technologies. However, due to large feature sizes, printed classifiers are limited by high power, area, and energy overheads, which restrict the realization of battery-powered systems. In this work, we design sequential printed bespoke Support Vector Machine (SVM) circuits that adhere to the power constraints of existing printed batteries while minimizing energy consumption, thereby boosting battery life. Our results show 6.5x energy savings while maintaining higher accuracy compared to the state of the art.
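As a rough illustration of why a sequential datapath trades latency for power, the sketch below evaluates a linear SVM decision function with one multiply-accumulate per step instead of a fully parallel dot product; the function name, loop structure, and numeric format are assumptions for illustration, not the authors' circuit.

```python
def sequential_svm_decision(x, weights, bias):
    """Evaluate sign(w . x + b) one multiply-accumulate at a time.

    A fully parallel printed circuit would need one multiplier per feature;
    a sequential design reuses a single MAC unit across cycles, cutting power
    and area at the cost of latency (illustrative sketch only).
    """
    acc = bias
    for w, xi in zip(weights, x):  # one MAC operation per "cycle"
        acc += w * xi
    return 1 if acc >= 0 else -1
```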
- Energy > Energy Storage (0.56)
- Electrical Industrial Apparatus (0.56)
RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults
Putra, Rachmad Vidya Wicaksana, Hanif, Muhammad Abdullah, Shafique, Muhammad
To maximize the performance and energy efficiency of Spiking Neural Network (SNN) processing on resource-constrained embedded systems, specialized hardware accelerators/chips are employed. However, these SNN chips may suffer from permanent faults that affect the functionality of the weight memory and neuron behavior, potentially causing significant accuracy degradation and system malfunction. Such permanent faults may come from manufacturing defects during the fabrication process and/or from device/transistor damage (e.g., due to wear-out) during run-time operation. However, the impact of permanent faults on SNN chips and the respective mitigation techniques have not been thoroughly investigated yet. Toward this, we propose RescueSNN, a novel methodology to mitigate permanent faults in the compute engine of SNN chips without requiring additional retraining, thereby significantly cutting down design time and retraining costs while maintaining throughput and quality. The key ideas of our RescueSNN methodology are (1) analyzing the characteristics of SNNs under permanent faults; (2) leveraging this analysis to improve SNN fault tolerance through effective fault-aware mapping (FAM); and (3) devising lightweight hardware enhancements to support FAM. Our FAM technique leverages the fault map of the SNN compute engine to (i) minimize weight corruption when mapping weight bits onto faulty memory cells, and (ii) selectively employ faulty neurons that do not cause significant accuracy degradation, maintaining accuracy and throughput while considering the SNN operations and processing dataflow. The experimental results show that our RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% at a high fault rate (e.g., 0.5 of the potential fault locations), as compared to running SNNs on the faulty chip without mitigation. In this manner, embedded systems that employ RescueSNN-enhanced chips can efficiently ensure reliable execution against permanent faults during their operational lifetime.
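One simplified way to picture the fault-aware mapping idea is that, given a fault map of the weight memory, the most significant bits of each weight are steered into fault-free cells so that any corruption is confined to low-order bits. The heuristic below is an illustrative sketch under assumed data structures, not RescueSNN's actual FAM procedure.

```python
def map_weight_to_word(weight_bits, faulty_bit_positions):
    """Place a weight's bits into a memory word so that the most significant
    bits land in fault-free cells (illustrative heuristic, not the paper's FAM).

    weight_bits: list of bits, index 0 = MSB.
    faulty_bit_positions: set of faulty cell indices within the word.
    """
    word_size = len(weight_bits)
    healthy = [i for i in range(word_size) if i not in faulty_bit_positions]
    faulty = [i for i in range(word_size) if i in faulty_bit_positions]
    # MSBs are assigned to healthy cells first; the remaining (least
    # significant) bits absorb the faulty cells, so any corruption has the
    # smallest possible numeric effect on the stored weight.
    placement = {}
    for bit_index, cell in zip(range(word_size), healthy + faulty):
        placement[cell] = weight_bits[bit_index]
    return placement  # cell index -> bit value to write
```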
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > New York (0.04)
- Education (0.69)
- Information Technology (0.68)
- Semiconductors & Electronics (0.46)
SambaNova Doubles Up Chips To Chase AI Foundation Models
One of the first tenets of machine learning, which is a very precise kind of data analytics and statistical analysis, is that more data beats a better algorithm every time. A consensus is emerging in the AI community that a large foundation model with hundreds of billions to trillions of parameters is going to beat a highly tuned model on a small subset of relevant data every time. If this turns out to be true, it will have significant implications for AI system architecture, as well as for who will likely be able to afford to run such ginormous foundation models in production. Our paraphrasing of "more data beats a better algorithm" is a riff on a quote from Peter Norvig, an education fellow at Stanford University and a researcher and engineering director at Google for more than two decades, who co-authored the seminal paper The Unreasonable Effectiveness of Data back in 2009, long before machine learning went mainstream but when big data was amassing, changing the nature of data analytics and giving great power to the hyperscalers who gathered it as part of the services they offered customers. "But invariably, simple models and a lot of data trump more elaborate models based on less data," Norvig wrote, and since that time he has been quoted saying something else: "More data beats clever algorithms, but better data beats more data."
SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network Accelerators under Soft Errors
Putra, Rachmad Vidya Wicaksana, Hanif, Muhammad Abdullah, Shafique, Muhammad
Specialized hardware accelerators have been designed and employed to maximize the performance efficiency of Spiking Neural Networks (SNNs). However, such accelerators are vulnerable to transient faults (i.e., soft errors), which occur due to high-energy particle strikes and manifest as bit flips at the hardware layer. These errors can change the weight values and neuron operations in the compute engine of SNN accelerators, leading to incorrect outputs and accuracy degradation. However, the impact of soft errors on the compute engine and the respective mitigation techniques have not been thoroughly studied yet for SNNs. A potential solution is employing redundant execution (re-execution) to ensure correct outputs, but it incurs huge latency and energy overheads. Toward this, we propose SoftSNN, a novel methodology to mitigate soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution, thereby maintaining accuracy with low latency and energy overheads. Our SoftSNN methodology employs the following key steps: (1) analyzing the SNN characteristics under soft errors to identify faulty weights and neuron operations, which is required for recognizing faulty SNN behavior; (2) a Bound-and-Protect technique that leverages this analysis to improve SNN fault tolerance by bounding the weight values and protecting the neurons from faulty operations; and (3) devising lightweight hardware enhancements for the neural hardware accelerator to efficiently support the proposed technique. The experimental results show that, for a 900-neuron network even under a high fault rate, our SoftSNN maintains the accuracy degradation below 3%, while reducing latency and energy by up to 3x and 2.3x, respectively, as compared to the re-execution technique.
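The bounding half of the Bound-and-Protect idea can be pictured as clamping each weight to a range derived from the fault-free network, so that a bit flip in a high-order bit cannot blow the value up. The threshold choice below is an assumption made for illustration, not SoftSNN's exact rule.

```python
import numpy as np

def bound_weights(weights, reference_weights, margin=1.1):
    """Clamp possibly fault-corrupted weights to a bound derived from the
    fault-free reference values (illustrative sketch, not SoftSNN's rule)."""
    bound = margin * np.max(np.abs(reference_weights))
    # Any bit flip that pushed a weight far outside the trained range is
    # pulled back to the bound, limiting its effect on neuron outputs.
    return np.clip(weights, -bound, bound)
```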
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- (2 more...)
GraphCore Goes Full 3D With AI Chips
The 3D stacking of chips has been the subject of much speculation and innovation in the past decade, and we will be the first to admit that we have been mostly thinking about this as a way to cram more capacity into a given compute engine while at the same time getting components closer together along the Z axis, rather than just working in 2D down on the X and Y axes. It was extremely interesting to see, then, that the 3D wafer-on-wafer stacking that AI chip and system upstart GraphCore has been working on with Taiwan Semiconductor Manufacturing Co had nothing to do with making logic circuits more dense within a socket. This will happen over time, of course, but the 3D wafer stacking that GraphCore and TSMC have been exploring together and are delivering in the third-generation "Bow" GraphCore IPU – the systems based on them bear the same nickname – is about creating a power delivery die that is bonded to the bottom of the existing compute die. The effect of this innovation is that GraphCore can get a more even power supply to the IPU, and therefore it can drop the voltage on its circuits and increase the clock frequency while at the same time burning less power. The grief and cost of building this power supply wafer and stacking the IPU wafer on top are outweighed by the performance and thermal benefits to the IPU, so GraphCore and its customers come out ahead on the innovation curve.
- Asia > Taiwan (0.25)
- Europe > United Kingdom > England > Buckinghamshire > Milton Keynes (0.05)
- Semiconductors & Electronics (1.00)
- Information Technology > Hardware (0.36)
Google Teaches AI To Play The Game Of Chip Design
As if it were not bad enough that Moore's Law improvements in the density and cost of transistors are slowing, the cost of designing chips and of the factories that are used to etch them is also on the rise. Any savings on any of these fronts will be most welcome to keep IT innovation leaping ahead. One of the promising frontiers of research right now in chip design is using machine learning techniques to actually help with some of the tasks in the design process. We will be discussing this at our upcoming The Next AI Platform event in San Jose on March 10 with Elias Fallon, engineering director at Cadence Design Systems.
Deploying a ML Model on Google Compute Engine - WebSystemer.no
Flask is not a web server. It is a micro web application framework, a set of tools and libraries that make it easier and prettier to build web applications. Flask ships with Werkzeug, a WSGI utility library that provides a simple web server for development purposes. While Flask's development server is good enough to test the main functionality of the app, it should not be used in production: it does not scale well and by default serves only one request at a time.
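For a production deployment on a VM such as a Compute Engine instance, one common minimal setup is to keep Flask for the application code and put a WSGI server such as Gunicorn in front of it. The app.py module, the /predict route, and the Gunicorn flags below are illustrative assumptions, not details from the article.

```python
# app.py -- minimal Flask app exposing a prediction endpoint (illustrative example)
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)
    # A real deployment would call model.predict(...) here; we echo the
    # input back purely for illustration.
    return jsonify({"input": payload, "prediction": None})

if __name__ == "__main__":
    # Development only. In production, run the app under a WSGI server, e.g.:
    #   gunicorn --workers 4 --bind 0.0.0.0:8000 app:app
    app.run(host="0.0.0.0", port=8000)
```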
ORIGAMI: A Heterogeneous Split Architecture for In-Memory Acceleration of Learning
Falahati, Hajar, Lotfi-Kamran, Pejman, Sadrosadati, Mohammad, Sarbazi-Azad, Hamid
The memory bandwidth bottleneck is a major challenge in processing machine learning (ML) algorithms. In-memory acceleration has the potential to address this problem; however, it needs to address two challenges. First, an in-memory accelerator should be general enough to support a large set of different ML algorithms. Second, it should be efficient enough to utilize bandwidth while meeting the limited power and area budgets of the logic layer of a 3D-stacked memory. We observe that previous work fails to simultaneously address both challenges. We propose ORIGAMI, a heterogeneous set of in-memory accelerators that supports the compute demands of different ML algorithms and also uses an off-the-shelf compute platform (e.g., FPGA, GPU, TPU) to utilize bandwidth without violating strict area and power budgets. ORIGAMI offers a pattern-matching technique to identify similar computation patterns across ML algorithms and extracts a compute engine for each pattern. These compute engines constitute heterogeneous accelerators integrated on the logic layer of a 3D-stacked memory. A combination of these compute engines can execute any type of ML algorithm. To utilize the available bandwidth without violating the area and power budgets of the logic layer, ORIGAMI comes with a computation-splitting compiler that divides an ML algorithm between the in-memory accelerators and an out-of-the-memory platform in a balanced way and with minimum inter-communication. The combination of pattern matching and split execution offers a new design point for the acceleration of ML algorithms. Evaluation results across 12 popular ML algorithms show that ORIGAMI outperforms the state-of-the-art accelerator with 3D-stacked memory in terms of performance and energy-delay product (EDP) by 1.5x and 29x (up to 1.6x and 31x), respectively. Furthermore, the results are within a 1% margin of an ideal system that has unlimited compute resources on the logic layer of a 3D-stacked memory.
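One way to picture the computation-splitting step is as partitioning an ML algorithm's operator list so that the in-memory engines and the external platform each receive a share proportional to their compute budget while keeping the cut between the two sides small. The greedy prefix split below is an assumed simplification for illustration, not ORIGAMI's actual compiler.

```python
def split_operators(ops, in_memory_budget):
    """Greedy split of an operator list between in-memory accelerators and an
    off-memory platform (illustrative sketch, not ORIGAMI's compiler).

    ops: list of (name, cost) tuples in topological order.
    in_memory_budget: fraction of total cost assigned to the in-memory side.
    """
    total = sum(cost for _, cost in ops)
    in_memory, off_memory, used = [], [], 0.0
    for name, cost in ops:
        # Keeping a contiguous prefix in memory limits the number of
        # tensors that must cross between the two sides.
        if used + cost <= in_memory_budget * total:
            in_memory.append(name)
            used += cost
        else:
            off_memory.append(name)
    return in_memory, off_memory
```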
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Asia > Middle East > Iran (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
More Details Emerge About Arm's Machine Learning
Arm is definitely targeting deep-neural-network (DNN) machine-learning (ML) applications with its proposed hardware designs, but its initial ML hardware descriptions were a bit vague (Figure 1). Though the final details aren't ready yet, Arm has exposed more of the architecture. The Arm ML processor is supported by the company's Neural Network (NN) software development kit, which bridges the interface between ML software and the underlying hardware. This allows developers to target Arm's CPU, GPU, and ML processors. In theory, waiting for the ML hardware will allow a critical mass of software to be available when the real hardware finally arrives.